75 research outputs found

On clustering procedures and nonparametric mixture estimation

Author: Auray Stéphane
Klutchnikoff Nicolas
Rouvière Laurent
Publication date: 01/01/2015

This paper deals with nonparametric estimation of conditional densities in mixture models in the case when additional covariates are available. The proposed approach consists of performing a preliminary clustering algorithm on the additional covariates to guess the mixture component of each observation. Conditional densities of the mixture model are then estimated using kernel density estimates applied separately to each cluster. We investigate the expected L1-error of the resulting estimates and derive optimal rates of convergence over classical nonparametric density classes provided the clustering method is accurate. Performances of clustering algorithms are measured by the maximal misclassification error. We obtain upper bounds of this quantity for a single linkage hierarchical clustering algorithm. Lastly, applications of the proposed method to mixture models involving electricity distribution data and simulated data are presented.

arXiv.org e-Print Archive
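
The two-step procedure described in the abstract of "On clustering procedures and nonparametric mixture estimation" (cluster the covariates, then fit a kernel density estimate within each cluster) can be illustrated roughly as follows. This is only a minimal sketch and not the authors' implementation: it assumes one-dimensional covariates, a Gaussian kernel, a fixed bandwidth, and a naive single-linkage routine. All function names and the toy data are hypothetical.

```python
import math


def single_linkage(points, n_clusters):
    """Naive single-linkage agglomerative clustering of 1-D points (illustrative)."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the two closest members.
                d = min(abs(points[i] - points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]
    labels = [0] * len(points)
    for k, members in enumerate(clusters):
        for i in members:
            labels[i] = k
    return labels


def gaussian_kde(sample, bandwidth):
    """Return x -> Gaussian kernel density estimate built from `sample`."""
    norm = len(sample) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                   for s in sample) / norm

    return density


def cluster_then_kde(covariates, observations, n_clusters, bandwidth):
    """Guess components by clustering covariates, then estimate one density per cluster."""
    labels = single_linkage(covariates, n_clusters)
    estimates = {}
    for k in set(labels):
        sample = [y for y, lab in zip(observations, labels) if lab == k]
        estimates[k] = gaussian_kde(sample, bandwidth)
    return labels, estimates


# Toy data (assumed for illustration): two well-separated covariate groups.
covariates = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
observations = [1.0, 1.2, 0.9, 4.0, 4.1, 3.8]
labels, estimates = cluster_then_kde(covariates, observations,
                                     n_clusters=2, bandwidth=0.5)
```

In this toy setting the clustering step recovers the two groups exactly, so each per-cluster estimate uses only observations from one component. The abstract's analysis concerns what happens when that clustering step makes errors.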

Statistical analysis of $k$-nearest neighbor collaborative recommendation

Author: Biau Gérard
Cadre Benoît
Rouvière Laurent
Publication venue: Institute of Mathematical Statistics
Publication date: 01/01/2010

Collaborative recommendation is an information-filtering technique that attempts to present information items that are likely of interest to an Internet user. Traditionally, collaborative systems deal with situations with two types of variables, users and items. In its most common form, the problem is framed as trying to estimate ratings for items that have not yet been consumed by a user. Despite wide-ranging literature, little is known about the statistical properties of recommendation systems. In fact, no clear probabilistic model even exists which would allow us to precisely describe the mathematical forces driving collaborative filtering. To provide an initial contribution to this, we propose to set out a general sequential stochastic model for collaborative recommendation. We offer an in-depth analysis of the so-called cosine-type nearest neighbor collaborative method, which is one of the most widely used algorithms in collaborative filtering, and analyze its asymptotic performance as the number of users grows. We establish consistency of the procedure under mild assumptions on the model. Rates of convergence and examples are also provided. Comment: Published at http://dx.doi.org/10.1214/09-AOS759 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

arXiv.org e-Print Archive
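
As a rough illustration of the cosine-type nearest neighbor idea mentioned in the abstract of "Statistical analysis of $k$-nearest neighbor collaborative recommendation", the sketch below predicts a user's rating for an unseen item by averaging the ratings of the k most cosine-similar users who have rated it. This is a common simplified variant and is not claimed to match the exact similarity or model analyzed in the paper. The function names and the toy ratings are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two sparse rating dicts (item -> rating)."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(r * r for r in u.values()))
    norm_v = math.sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v)


def predict_rating(ratings, user, item, k):
    """Average the ratings of `item` given by the k users most similar to `user`."""
    candidates = [
        (cosine_similarity(ratings[user], other), other[item])
        for name, other in ratings.items()
        if name != user and item in other
    ]
    # Most similar users first.
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    neighbors = candidates[:k]
    if not neighbors:
        return None  # nobody else has rated this item
    return sum(r for _, r in neighbors) / len(neighbors)


# Toy ratings (assumed for illustration only).
ratings = {
    "alice": {"a": 5, "b": 4},
    "bob": {"a": 5, "b": 4, "c": 2},
    "carol": {"a": 1, "b": 5, "c": 5},
    "dave": {"a": 4, "b": 5, "c": 3},
}
prediction = predict_rating(ratings, "alice", "c", k=2)
```

Here bob and dave are the two users closest to alice, so the prediction for item "c" is the mean of their ratings for that item.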

Silicon nanowires as negative electrode for lithium-ion microbatteries

Author: Cojocaru Costel Sorin
Eude Laurent
Laik Barbara
Pereira-Ramos Jean Pierre
Pribat Didier
Rouvière Emmanuelle
Publication venue: Elsevier BV
Publication date: 01/01/2008

The increasing demand for secondary batteries with higher specific energy densities requires the replacement of the current electrode materials. With a very high theoretical capacity (4200 mAh g⁻¹) at low voltage, silicon is presented as a very interesting potential candidate as negative electrode for lithium-ion microbatteries. For the first time, the electrochemical lithium alloying and de-alloying processes are shown to occur at 0.15 V and 0.45 V vs. Li+/Li, respectively, with Si nanowires (SiNWs, 200-300 nm in diameter) synthesized by chemical vapour deposition. This new three-dimensional architecture material is well suited to accommodate the expected large volume expansion due to the reversible formation of Li-Si alloys. At present, stable capacity over ten to twenty cycles is demonstrated. The storage capacity is shown to increase with the growth temperature by a factor of 3 as the temperature varies from 525 to 575 °C. These results, showing an attractive working potential and large storage capacities, open up a promising new field of research.

HAL-CEA

HAL-Polytechnique

HAL - UPEC / UPEM

Functional supervised classification with wavelets

Author: Berlinet Alain
Biau Gérard
Rouvière Laurent
Publication venue: Publications de l’Institut de Statistique de l’Université de Paris
Publication date: 01/01/2008

HAL Descartes

Hal-Diderot

HAL-Rennes 1

Optimal bandwidth selection for variable kernel density estimates

Author: Berlinet Alain
Biau Gérard
Rouvière Laurent
Publication venue: Elsevier BV
Publication date: 01/01/2005

It is well established that one can improve performance of kernel density estimates by varying the bandwidth with the location and/or the sample data at hand. Our interest in this paper is in the data-based selection of a variable bandwidth within an appropriate parameterized class of functions. We present an automatic selection procedure inspired by the combinatorial tools developed in Devroye and Lugosi (2001). It is shown that the expected L1 error of the corresponding selected estimate is up to a given constant multiple of the best possible error plus an additive term which tends to zero under mild assumptions.

HAL Descartes

Hal-Diderot
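
To make the notion of a variable kernel density estimate from the abstract of "Optimal bandwidth selection for variable kernel density estimates" concrete, the following sketch lets the bandwidth depend on the sample point: each observation gets its own bandwidth, taken here as the distance to its k-th nearest neighbour. This bandwidth rule is an assumption chosen for illustration. It is not the combinatorial selection procedure of the paper, and the function names and sample are hypothetical.

```python
import math


def knn_bandwidths(sample, k):
    """Bandwidth of each point: distance to its k-th nearest neighbour.

    Assumes distinct sample values, so that every bandwidth is positive.
    """
    bandwidths = []
    for i, x in enumerate(sample):
        dists = sorted(abs(x - y) for j, y in enumerate(sample) if j != i)
        bandwidths.append(dists[k - 1])
    return bandwidths


def variable_kde(sample, bandwidths):
    """Return x -> (1/n) * sum_i K((x - X_i) / h_i) / h_i with a Gaussian kernel K."""
    n = len(sample)

    def density(x):
        total = 0.0
        for xi, h in zip(sample, bandwidths):
            z = (x - xi) / h
            total += math.exp(-0.5 * z * z) / (h * math.sqrt(2.0 * math.pi))
        return total / n

    return density


# Toy sample (assumed): dense near 0, sparse in the right tail.
sample = [0.0, 0.1, 0.2, 0.3, 2.0, 4.0]
h = knn_bandwidths(sample, k=2)
f_hat = variable_kde(sample, h)
```

Points in the dense region receive small bandwidths and isolated points receive large ones, which is the kind of adaptation to the data that the abstract refers to.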

Nonparametric Forecasting of the Manufacturing Output Growth with Firm-level Survey Data

Author: Gérard Biau
Laurent Rouvière
Olivier Biau

A large majority of summary indicators derived from the individual responses to qualitative Business Tendency Surveys (which are mostly three-modality questions) result from standard aggregation and quantification methods. This is typically the case for the indicators called balances of opinion, which are currently used in short-term analysis and considered by forecasters as explanatory variables in many models. In the present paper, we discuss a new statistical approach to forecast manufacturing growth from firm-survey responses. We base our predictions on a forecasting algorithm inspired by the random forest regression method, which is known to enjoy good prediction properties. Our algorithm exploits the heterogeneity of the survey responses, works fast, is robust to noise and allows for the treatment of missing values. In a real application on a French dataset related to the manufacturing sector, this procedure appears competitive with traditional algorithms.

Keywords: Business Tendency Surveys, balance of opinion, short-term forecasting, manufactured production, k-nearest neighbor regression, random forests

Crossref

Research Papers in Economics

Optimal L1 bandwidth selection for variable kernel density estimates

Author: Berlinet Alain
Biau Gérard
Rouvière Laurent

It is well established that one can improve performance of kernel density estimates by varying the bandwidth with the location and/or the sample data at hand. Our interest in this paper is in the data-based selection of a variable bandwidth within an appropriate parameterized class of functions. We present an automatic selection procedure inspired by the combinatorial tools developed in Devroye and Lugosi [2001. Combinatorial Methods in Density Estimation. Springer, New York]. It is shown that the expected L1 error of the corresponding selected estimate is up to a given constant multiple of the best possible error plus an additive term which tends to zero under mild assumptions.

Keywords: Variable kernel estimate, Nonparametric estimation, Partition, Shatter coefficient

Research Papers in Economics